The impact of data replication on job scheduling performance in the Data Grid

نویسندگان

  • Ming Tang
  • Bu-Sung Lee
  • Xueyan Tang
  • Chai Kiat Yeo
چکیده

In the Data Grid environment, the primary goal of data replication is to shorten the data access time experienced by the job and consequently reduce the job turnaround time. After introducing a Data Grid architecture that supports efficient data access for the Grid job, the dynamic data replication algorithms are put forward. Combined with different Grid scheduling heuristics, the performances of the data replication algorithms are evaluated with various simulations. The simulation results demonstrate that the dynamic replication algorithms can reduce the job turnaround time remarkably. In particular, the combination of shortest turnaround time scheduling heuristic (STT) and centralized dynamic replication with response-time oriented replica placement (CDR RTPlace) exhibits remarkable performance in diverse system environments and job workloads. © 2005 Elsevier B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data Replication-Based Scheduling in Cloud Computing Environment

Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

متن کامل

An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

The data grid technology, which uses the scale of the Internet to solve storage limitation for the huge amount of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environment to copy frequently accessed data in suitable sites. The primary purposes are shortening distance of file transmission and achieving files from ...

متن کامل

Dynamic Replication based on Firefly Algorithm in Data Grid

In data grid, using reservation is accepted to provide scheduling and service quality. Users need to have an access to the stored data in geographical environment, which can be solved by using replication, and an action taken to reach certainty. As a result, users are directed toward the nearest version to access information. The most important point is to know in which sites and distributed sy...

متن کامل

A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability

Data Grid is an infrastructure that controls huge amount of data files, and provides intensive computational resources across geographically distributed collaboration. The heterogeneity and geographic dispersion of grid resources and applications place some complex problems such as job scheduling. Most existing scheduling algorithms in Grids only focus on one kind of Grid jobs which can be data...

متن کامل

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years creates multiple copies of file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...

متن کامل

Increasing performance in Data grid by a new replica replacement algorithm

Data Grid provides sharing services for very large data around the world. Data replication is one of the most effective approaches to reduce access latency and response time. In addition to the benefits, replication has costs such as storage and bandwidth consumption, especially when storage space is low and limited. Therefore,  the data replacement should be done wisely. In this p...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Future Generation Comp. Syst.

دوره 22  شماره 

صفحات  -

تاریخ انتشار 2006